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We propose a bivariate model for a pair of dependent unit vectors which is generated by Brow- 
nian motion. Both marginals have uniform distributions on the sphere, while the conditionals 
follow so-called "exit" distributions. Some properties of the proposed model, including param- 
eter estimation and a pivotal statistic, are investigated. Further study is undertaken for the 
bivariate circular case by transforming variables and parameters into the form of complex num- 
bers. Some desirable properties, such as a multiplicative property and infinite divisibility, hold 
for this submodel. Two estimators for the parameter of the submodel are studied and a simu- 
lation study is carried out to investigate the finite sample performance of the estimators. In an 
attempt to produce more flexible models, some methods to generalize the proposed model are 
discussed. One of the generalized models is applied to wind direction data. Finally, we show how 
it is possible to construct distributions in the plane and on the cylinder by applying bilinear 
fractional transformations to the proposed bivariate circular model. 

Keywords: bivariate circular distribution; copula; exit distribution; wrapped Cauchy 
distribution 

1. Introduction 

In a variety of scientific fields, observations are described as pairs of d-dimensional unit 
vectors. In meteorology, for example, wind directions at the weather station in Milwaukee 
at 6 a.m. and noon (Johnson and Wehrly (1977)) are data of this type with d~2. Another 
example with d = 3 consists of directions of the magnetic field in a rock sample before 
and after some laboratory treatment (Stephens (1979)). 

For the analysis of data of this type, various stochastic models have been proposed in 
the literature. Some distributions with certain marginals or conditionals are seen in Mar- 
dia (1975), Wehrly and Johnson (1980), Rivest (1988, 1997), Downs and Mardia (2002), 
SenGupta (2004) and Shich and Johnson (2005). Among various works on distributions 
for bivariate angular data considerable attention has been paid to models with uniform 
marginals, which can be viewed as spherical equivalents of copulas. Johnson and Wehrly 
(1977) provided a bivariate circular distribution with uniform marginals and von Mises 
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conditionals. Saw (1983) introduced a bivariate family with uniform marginals for pairs 
of dependent unit vectors which is an offset distribution of the multivariate normal dis- 
tribution with some restrictions on parameters. Rivest (1984) discussed a certain class 
of distributions with so-called "0(<i)-synimetric" densities, which has uniform marginals. 
Recent work by Alfonsi and Brigo (2005) proposed new families of copulas based on 
periodic functions. 

The potential application of these special copulas is not restricted to the bivari- 
ate angular data whose marginals are uniformly distributed. Saw (1983) constructed a 
method which extends these copulas to distributions having any rotationally symmetric 
marginals. In the bivariate circular case, it is also possible to use a well-known technique 
in copula theory to generalize the model. These methods enable us to provide a bivariate 
model with prescribed marginals. 

The main purpose of this paper is to introduce a new distribution with uniform 
marginals which is generated by Revalued Brownian motion. To our knowledge, distri- 
butions for bivariate angular data have not previously been proposed based on Brownian 
motion. In this paper, a new approach is taken to provide a tractable model. This method 
enables us to define a class of bivariate distributions with uniform marginals and derive 
some desirable properties. We also discuss generalization of the proposed model. 

Section 2 suggests a model for two dependent unit vectors and Section 3 investigates 
properties of the proposed model, including parameter estimation and a pivotal statistic. 
In Section 4, we focus on the bivariate circular case of the model and discuss its properties 
in detail. It is shown that some desirable properties, such as a multiplicative property 
and infinite divisibility, hold for this submodel. Some properties of two estimators for 
the parameter of the submodel are studied by means of a simulation study. In Section 
5, generalizations of the proposed model are discussed and one of the extended models 
is applied to wind direction data. Finally, in Section 6, related models on K 2 and on the 
cylinder are constructed by applying bilinear fractional transformations to the proposed 
model. 

2. A model for a pair of unit vectors 
2.1. Definition of the proposed model 

Let {B t ; t > 0} be Revalued Brownian motion where d>2. Starting at Bq = 0, a Brown- 
ian particle will eventually hit a d-sphere with radius p (G (0, 1)). Let t± be the minimum 
time at which the particle exits the sphere, that is, t\ = 'mi{t; \\B t \\ = p}, where || • || is the 
Euclidean norm. After leaving the sphere with radius p, the particle will hit a unit sphere 
first at time T2, meaning t-j = inf{£; \\B t \\ = 1}. The proposed model is then defined by 
the joint distribution of a random vector 
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where Q is a member of 0(d), the group of orthogonal transformations on M. d . Note that 
the reason for multiplying Q by B ri /||S Tl || is to make the model more flexible without 
losing its tractability. 



2.2. Probability density function 

For convenience, write (U,V) = (QB Tl /\\B Tl \\,B T2 ). It is clear that (U,V) is a random 
vector where each variable takes values on the unit sphere. The joint distribution of 
(U, V) has density 

1 1 — 2 

where p e [0, 1), Q G 0(d), S^ 1 = {x G R d ; \\x\\ = i},x' is the transpose of x and A d ~\ 
is surface area of that is, A^-i = 2n d ' 2 /T(d/2). The domain of p is extended to 

include p = so that the model includes the uniform distribution. We write (U, V) ~ 
BSd(pQ) if a random vector (E/, V) has density (2.1). For the derivation of the density 
(2.1), sec Appendix A. 

The parameter p influences the dependence between U and V. When p = 0, U and 
V are independent and distributed as the uniform distribution on the sphere, that is, 
c(u,v) = 1/A 2 d _ l on u, v G S 1 . As p tends to 1, it can be shown that P(||f7 — QV\\ < 
e) — > 1 for any e > 0. We note that this property holds not only for density (2.1), but also 
for any 0(<i)-symmetric density with shape parameter k, namely, f(u,v) — g(u'Qv; k). 
For any such model, it holds that P(||C/ — QV\\ < e) is monotonically increasing with 
respect to k. 

As is clear from the form of (2.1), c(u, v) is a function of u'Qv, the inner product of u 
and Qv . From this fact, we easily find that the density (2.1) takes maximum (minimum) 
values for a given v at u = Qv (u = —Qv). The parameter Q thus controls the mode 
of the density. It is known that an orthogonal transformation Q involves two types 
of transformation, namely, rotation and/or reflection. In particular, when d = 2, these 
transformations can be expressed as 

cos6> -sin6>\ , /l 

•a a I v and v 1 y n 1 

sin# cos 6 J \0 — 1 

where < 9 < In. If detQ = 1, this transformation consists of only rotation. Otherwise, 
if detQ = —1, the transformation is made up of a reflection together with a rotation. 



3. Properties of and inference for the proposed model 
3.1. Marginals and conditionals 

One important feature of the proposed model is that it has well-known marginals and con- 
ditionals. Suppose (U,V) ~ BSd(pQ)- The density for this random vector, (2.1), is O(d)- 
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symmetric in the sense of Rivest (1984), Example 1. It then follows that the marginals 
of U and V are uniform distributions on S d ^ 1 with density 

Ai-i 

Hence, model (2.1) can be viewed as a copula on S' d_1 x One difference between 

this special copula and the usual ones is the periodicity of its variables. As discussed 
in Saw (1983), Section 4, it is possible to obtain the model with rotationally symmetric 
marginals from the one with uniform marginals. One can also generalize the bivariate 
circular model, that is, d = 2, so that both marginals have specified distributions by 
using copula theory. We discuss generalizations using the above techniques and some 
other methods in Section 5. 

Both conditional distributions of U given V = v and V given U = u are the exit 
distributions for the sphere. The terminology exit distribution is taken from Durrctt 
(1984), Section 1.10, and the exit distribution on S l , Exit^r?), is of the form 

/(*)= / \~ M L zeS" 2 " 1 , (3.1) 
A d - 1 \\x - r]\\ a 

where 77 € {(, € Mr; ||f|| < 1}. This distribution is unimodal and rotationally symmetric 
about x = rj/\\r]\\ with the concentration being controlled by In particular, when 
I ?/|| = 0, the distribution reduces to the spherical uniform. As ||?7|| — ► 1, it tends to a 
degenerate distribution concentrated at x = 77. It follows that the conditionals of the 
model with density (2.1) are U\(V = v) ~ Exit d (pQv) and V\(U =u) ~ Exit d (pQ'u). 

It is worth remarking that the conditional of W = v'Q'U given V = v has a family 
discussed by Leipnik (1947) and McCullagh (1989). The derivation of the density is clear 
from the latter paper or from Watson's result (Watson (1983), Equation 3.4.1). As in the 
latter paper, write X ~ H'(9, v) if the density of the random variable X is 

f( X ) = = - [ - -1<X<1 

T{ ) B(v + 1/2,1/2) (l-2fe + 2 )-+ 1 ' ^-^^ 
where -1 < 9 < 1 and v > -1/2. It then follows that W\(V = v) - H'{p, (d - 2)/2}. 



3.2. Some properties 



Here, we investigate some of the properties of the model with density (2.1). The first is 
that the distribution is closed under orthogonal transformations: 

{U,V)~BS d { P Q) => {QiU,Q 2 V)~BS d {pQ 1 QQ l 2 ), Q 1 ,Q 2 eO{d). 

The next result can be obtained by applying a result which appears in, for example, 
Durrett (1984), Section 1.10. 
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Theorem 1. Suppose that (U,V) is distributed as BSd(pQ). Let f be C 2 in D and 
continuous on D, where D — {£ £ R d ; ||£|| < 1}. If f is harmonic, namely, 

Q2 Q2 Q2 

dx\ dx\ dx d 
then E{f(V)\U = u} = f(pQ'u) and E{f{U)\V = v} = f(pQv). 

Using this fact, it is easy to show that E{f(U)} = E{f(V)} = f(0). 

The moments and correlation coefficient of the model are given by the following thco- 



Theorem 2. Suppose (U,V) has density (2.1). Then 

E(U)=E(V)=0, EiUU^^EiVV^^d" 1 !, 

(3.2) 

E(UV') = d- 1 pQ. 
The Johnson and Wehrly (1977) coefficient of correlation, pjw, is thus 

— \l/2 

Pjw = A ' = p, 

where A is the largest eigenvalue of S^yEj/ySyyEy-y, Sj/j/ = E(UU') — E(U)E(U'), 
Y, uv = E{UV) - E{U)E{V) and Y> vv = E{W) - E(V)E(V). 

See Appendix B for the proof. 

Note the simplicity of these moments and of the correlation coefficient. To our knowl- 
edge, no distributions for bivariatc angular data have such a simple correlation coefficient 
except for the uniform distribution. 

The following is useful for constructing a pivotal statistic for (p, Q), which is discussed 
in Section 3.5. The result is stated in its general form as follows. The proof is also given 
in Appendix B. 

Theorem 3. Assume (U, V) has 0(d) -symmetric density, that is, f(u,v)=g(vlQv), 
where Q € 0(d). The density of T = U'QV is then given by 

h{t) = A d _ x ■ A d . 2 g(t)(l - t 2 )( d - 3 )/ 2 , -1 < t < 1. 

From this theorem, it follows that if (U, V) - BS d (pQ), then U'QV ~ H'{p, (d- 1)/2}. 
Note that this result also enables us to obtain the density for X'QY of Saw's (1983) 
distribution immediately. 



3.3. Random vector simulation 

To generate a random vector having density (2.1), it is helpful to apply the idea which 
appears in, for instance, Saw (1983) and Watson (1983), Equation 2.2.1. We generalize 
the idea as follows. 
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Let W be a random variable from H'{p, (d-2)/2} and let d(X;() = (I-(C)X/\\(I- 
££')X\\, where £ S S' d_1 and X is a random vector having a uniform distribution on 
S ,d_1 . In other words, d(X; () has a uniform distribution on the (d— l)-sphere, S±, in M. d 
defined by S± = {q e R d ; \\v\\ = l,Cv = 0}- The conditional of U given V = v can then 
be decomposed into 



Given this, generation of variates from (2.1) can be carried out using the following 
three steps: (i) generate a random vector V which has a uniform distribution on S, 
achieved by using the method proposed by Tashiro (1977); (ii) generate W, which has 
H'{p, (d— 2)/2}, as stated in Section 4 of McCullagh (1989); (iii) finally, a random vector 
d(X;Qv) distributed as a uniform distribution on S± is obtained in a similar manner as 
in step (i) and one obtains a variate from the conditional of U given V = v as described 
in the preceding paragraph. The joint distribution of (U, V) is then BSd(pQ)- 

3.4. Parameter estimation 

Parameter estimation for multivariate distributions is often difficult. This is also the case 
for our model. However, one can discuss parameter estimation under certain conditions. 
Here, we consider parameter estimation based on the method of moments and maximum 
likelihood. 

First, the method of moments estimator is constructed from (3.2). Assume that (Uj, Vj) 
(j = 1, . . . ,n (> 2)) is a random sample from a distribution with density (2.1) with un- 
known parameters p and Q. Under the condition rank(^™ =1 UjV!) = d, one can construct 
an estimator for the parameters based on the moment E(UV). This is done by equating 
the theoretical and sample moments. We thus obtain 



The estimators of p and Q induced from the condition | detQ| — 1 arc then given by 



U\{V 



v 



) 4 WQv + (1 - W 2 ) 1/2 d{X; Qv). 




(3.3) 



3=1 




The estimator pQ has the following properties. For the proof, see Appendix B. 



Theorem 4. The following hold for the estimator pQ defined in (3.3): 
(i) pQ is an unbiased and consistent estimator of pQ; 
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(ii) if g is a function defined by g(A) = (a' l7 . . . , a' d )' , where A = (oi, . . . , ad) is a d x d 
matrix, then 



Vn~{g(pQ) - g(pQ)} JV(o,s) 



as n — » oo , 



(3.4) 



where £ = (<r TOn ) is 

l+^{(d-2)4-2}, m = r», 

d + 2 ^ qk3qil ~ 2(? y <?fe ')' otherwise, 

qij is the (i,j)th entry of Q, m = d(j — 1) + i and n = — 1) + fc, 1 < i,j, k, I < d. 

We note that although Q is an unbiased estimator of Q with |detQ| = 1, it is not 
necessarily an orthogonal matrix. 

Next, we consider maximum likelihood estimation. Let (Uj,Vj) (j = l,...,n) be an 
i.i.d. sample from BSd(pQ), where Q is known and p is unknown. The log- likelihood for 
p is given by 

j n 

l(p) = C + nlog(l - P 2 ) - 2 E " Zpu'jQVj + P 2 ), (3.5) 

3=1 

where C is a constant which does not depend on p. The derivative with respect to p is 

dl 2np v Xj — p 



dp 1 — p 2 1 — 2pxj + p 2 ' 

where Xj = u'^Qvj G [—1, 1]. From this expression, we find that the maximization of (3.5) 
with respect to p is essentially the same as that of H'{p, (d— 2)/2} with respect to p. 



3.5. Pivotal statistic 

Suppose (U, V) is a BSd(pQ) random vector. Define a random variable 

1 - (U'QV) 2 



T(p,Q) = 



1 ~2pU'QV + p 2 



It is easy to see that < T(p,Q) < 1 a.s. for any p and Q. As shown in Theorem 3, 
U'QV ~ H'{p,(d - 2)/2}. By using the results in McCullagh (1989), Section 4, and 
Abramowitz and Stegun (1972), equations (15.1.13) and (15.3.1), one then obtains 

FlT(n on - B{r + {d-l) /2,l/2) 
} - B((d-l)/2,l/2) ' 
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where B(-, •) is a beta function. Since these moments are equal to those of a beta distri- 
bution Beta((rf — l)/2, 1/2), it follows that T(p, Q) is a pivotal statistic for (p, Q) having 
a Beta((d — l)/2,l/2) distribution almost surely. 

4. Bivariate circular case 

4.1. Transformation of random vectors and parameters 

Thus far, we have considered properties of model (2.1) for the general dimensional case. 
In this section, we specifically discuss the bivariate circular case of the proposed model 
which possesses some unique properties. 

Suppose (U,V) ~ BS2(pQ)- Its density is then expressed as 

( \ 1 cQ i 

c{u,v) = -—z- — — z, u,v£b . 

y J 47t 2 1 - 2pu'Qv + p 2 

For ease of discussion, it will be helpful to transform the random variables and parameters 
by taking 

(Z u ,Z v ) = {U 1 + iU 2 ,V 1 + iV 2 ) and <P = pc w , 
where U = (Ux, U2) 1 , V = (Vx, V2)' and 6 is a constant satisfying 

„ / cos 9 — dct Q sin 0\ „ . n _ f A t\ 

Q=[ ^ , 0<6»<2tt. (4.1) 
V sin 6 det Q cos J 

It then follows that < 1 and Zu, Zv S f2, where J] = {z£C; |z| = 1}. The density for 
(Zu,Z v ) is given by 

1 1-H 2 

C(^u, Zu) = T~2 m J -dotQio ' Z«,^GO- (4.2) 

If (Zy, Zy) has density (4.2) with det Q = 1, we write (Zy, Zy) ~ BC+(i/j). Similarly, we 
write (Zc/jZy) - BC-(ip) if (Zy, Zy) has density (4.2) with detQ = -1. 

Note that this transformation does not actually change the distribution. All we have 
done is to express the random variables and the parameters in the form of complex 
numbers for the sake of further investigation of the distributions. 

As already stated in Section 2.2, the marginals of Zjj and Zy are circular uniform, 
whereas both conditionals of Z/y given Zy = z v and Zy given Zjj = z u are exit distribu- 
tions for the circle, that is, wrapped Cauchy distributions. For brevity, we introduce the 
notation C*{<j>) derived from McCullagh (1996) which denotes the wrapped Cauchy or 
circular Cauchy distribution with density 

« 2 > = ^CT «*M<1- 
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The relationship \<f>\ = ||£j| and arg(0) = arg(£i + i£ 2 ), where £ = (£1,^2)', holds between 
the parameters of model (3.1) and those of the density above via a transformation 
Z =Xi +iX 2 . See McCullagh (1996), Mardia and Jupp (2000), pp. 51-52, and Jam- 
malamadaka and ScnGupta (2001), pp. 45-46, for further properties of the wrapped 
Cauchy and circular Cauchy distribution. For model (4.2), it is easy to show that 
ZuftZy = Zv) ~ C*(^z v ) and Z V \(Z V = z u ) ~ C*(^z u ). 

4.2. Some properties 

To investigate other properties of the model, it is useful to calculate its moments. Assume 
that (Zjj,Zv) has BC+(ip). The moments for (Zu,Zy) are then obtained, by applying 
Rudin (1987), Theorem 11.13, as 

(r, j = ~k>0, 
E(Z^Z v k ) = j = —k < 0, for j, k G Z. (4.3) 

1 0, otherwise, 

Similarly, we can obtain the moments for BC '_ . According to Fourier scries expansion 
theory, one can recover the density from these moments if the density / satisfies / G 
L 2 (n x fi). See Dym and McKean (1972), Section 1.10, for details. 

Using these results, the following properties are established. First, the BC+ model has 
the multiplicative property 

(4.4) 

=> (ZuiZu2,ZviZv2) ~ BC+Wifa)- 

Likewise, it can be shown that the BC- model also has this multiplicative property. 
However, the convolution of BC+ and BC- is the uniform distribution, that is, 

(Z U1 ,Z V1 ) ~ BC + (tPi)±(Z U2 , Z V2 ) ~ BC.(^ 2 ) 
(Zu\Zu2, Zv\Zv2) ~ BC + (0). 

In addition, 

(Zu,Z v )~BC ± (>tp) {Zu n ,Z v n )~ BC±{i; n ) for any n G N. 

As n tends to infinity, the distribution of (Zu™, Zy n ) tends to a uniform distribution on 
the torus. 

Furthermore, model (4.2) is infinitely divisible with respect to multiplication. This can 
be proven as follows. Let (Zu, Zy) ~ BC±(ip). For any positive integer n, the assumption 
that (Zu j , Zy j) (j = 1, . . . , n) is an i.i.d. sample from BC '±( n y^fj) then yields 



n n \ 

JZ Uj ,l[Zy j )l(Zu,Zv). 



(4.5) 
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4.3. Random vector simulation 

In order to simulate a BC+(ip) random vector, one could generate R 2 -valucd Brownian 
motion and record the points at which the Brownian particle hits circles with radii p and 
1. However, this algorithm is somewhat inefficient because we need to simulate Brownian 
motion at least up to the time at which the particle hits the unit circle. Another possibility 
is discussed in Section 3.3, but it, too, is less efficient than the method proposed below. 
The focus of this subsection is therefore to discuss an algorithm for simulating BC+(ijj) 
variates which we conclude to be more appealing than the aforementioned methods. 

To obtain the random vector, we use the fact that the marginal of Zjj is circular uniform 
and the conditional of Zy given Z\j = z u is wrapped Cauchy, specifically, C*{ipz u ). For 
the generation of a variate from a wrapped Cauchy distribution, we apply a result from 
McCullagh (1996) concerning the Mobius transformation of a circular uniform random 
variable, namely that 

Z~C*(0) ^±L~C(p), \B\<1. (4.6) 

1 + pZ 



An algorithm for generating BC + (ip) random vectors then involves the following steps. 

Step 1: 
Step 2: 

Step 3: 



Generate uniform (0,1) random numbers U\ and U-i. 
Set Zjj = exp(27ti[/i) and Z T = exp(27riJ7 2 )- 



Let Z V = ^U±M2-. 

l+ipZvZ T 



The joint distribution of (Zjj,Zy) is then BC+(ip). In Step 2, Zjj and Zt are in- 
dependent circular uniform random variables. In Step 3, because of property (4.6), 
the conditional distribution of Zy given Z\j = z u is C*(ipz u ). It therefore follows that 
(Zu,Z v )~BC+(1>). 

BC-(jp) random vectors can be simulated using a very similar approach. 



4.4. Parameter estimation 

Here, we consider parameter estimation for the BC+(ip) model based on the method of 
moments and maximum likelihood. Although we discuss parameter estimation for only 
the BC+(ip) here, it is possible to derive the estimates of the parameters for the BC-(ip) 
model by a straightforward modification of the result below. 

First, we consider method of moments estimation based on (4.3). Assume (Zjj,Zv) 
is a BC+(ip) random variable. As discussed in Section 4.2, its theoretical moments are 
given by (4.3) . Suppose (Zjjj , Zyj ) (j = 1 , . . . , n) is a random sample from the BC+ (ip) 
distribution. The method of moments estimator is obtained by equating the theoretical 
and sample moments. We thus obtain 

1 - 

^ = -EW^- (4-7) 
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It follows from the weak law of large numbers that ip is a consistent estimator of ijj. 
In addition, the central limit theorem enables us to prove asymptotic normality of the 
estimator, namely, 



Re(V>)\ _ /Re(V0\l _d. Ar / n 1 r 



, ,\ \ ( — -N< 0,-(l- \ib\)I > asn^oo. 

Although this estimator is different from the method of moments estimator (3.3), these 
estimators are somewhat related. Recall that the relationship (4.1) holds between axg(tp) 
and Q. If dctQ = 1, pQ can then be expressed as 



pQ = 



Re(V>) -Im(V>)\ 
Im(V>) Re(V') ) 



Given this relationship, an estimator of ij) induced naturally from the method of moments 
estimator (3.3) is 

where Uj = (Uji,Uj2)' and Vj = (Vji,Vj2)'- This estimator is equal to the method of 
moments estimator (4.7). 

Second, turning to the maximum likelihood estimation, it is obvious that the max- 
imum likelihood estimator coincides with the method of moments estimator, that is, 
i/j = ZjjxZvi, for a single observation (i.e., when n = 1). When n is large, the estimates 
must be obtained numerically Note that the likelihood function can be written as 



L{if>) ex I 



1-lVf 



This expression suggests that maximum likelihood estimation for the BC+(ip) model es- 
sentially coincides with that for the wrapped Cauchy distribution C*(i/j). We can there- 
fore obtain estimates by applying the algorithm of Kent and Tyler (1988). Since distri- 
bution (4.2) is identifiable and the parameter space, the unit disc, is finite, consistency 
of the maximum likelihood estimator follows from the general theory (see, e.g., Bahadur 
(1971)). The Fisher information matrix for (Re(-0),Im(-0)), denoted by 2"(Re(^),Im(^)), 
is simply expressed as 

J{ReW,ImW}= (1 _^ |2)2 J. 

The above can be obtained by transforming random variables into polar coordinates and 
using (3.616.7) of Gradshteyn and Ryzhik (1994). Therefore, the following hold for the 
maximum likelihood estimator: 



i 
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Table 1. Estimates of relative mean squared errors of the method of moments estimator (4.7) 
with respect to the maximum likelihood estimator 





ip = 0.1 


= 0.3 


^ = 0.5 


tp = 0.7 


•0 = 0.9 


n = 10 


0.919 


0.998 


1.155 


1.620 


4.135 


n = 20 


0.963 


1.032 


1.221 


1.749 


4.767 


n = 30 


0.980 


1.071 


1.229 


1.795 


4.942 


n = 50 


0.977 


1.059 


1.306 


1.827 


5.039 


n= 100 


0.992 


1.105 


1.311 


1.891 


5.088 


n = oo 


1.010 


1.099 


1.333 


1.961 


5.263 



4.5. Simulation study 

In this subsection, a simulation study is carried out to compare the finite sample per- 
formance of the estimators. Here, we discuss two estimators for the parameter ip based 
on the method of moments (4.7) and maximum likelihood. As for the estimator (3.3), 
we do not discuss it here because it is expressed in matrix form and we cannot directly 
compare it with the other two estimators. 

In our simulation study, random samples of sizes n = 10,20,30,50 and 100 from 
BC+(ip) with ip = 0.1,0.3,0.5,0.7 and 0.9 are generated. For each combination of n 
and ip, 2000 random samples are gathered. Random vectors from BC+(ip) are generated 
by using a method introduced in Section 4.3. We employ the Mersenne Twister, which 
is implemented by the command runif in R 2.7. 1, to obtain uniform random variatcs. 

We discuss the performance of the estimators in terms of the estimates of the mean 
squared error. In this case, the estimate of the mean squared error is given by Y^j^i IV'j — 
ip\ 2 /2000, where the ipj's (j = 1, . . . , 2000) are the estimates for ip. 

Estimates of the relative mean squared errors of the method of moments estimator (4.7) 
with respect to the maximum likelihood estimator for some selected values of n and ip 
are given in Table 1. In the table, the relative mean squared errors for n — oo are derived 
from the asymptotic variance of two estimators as (1 — |V>| 2 )/(1 — |V>| 2 ) 2 = 1/(1 — IV'I 2 )- 
The comparison of the relative mean squared errors shows that the maximum likelihood 
estimator provides a better result in many cases, especially for large \ip\ or n. However, 
the difference diminishes as n or \ip\ decreases. In particular, when ip = 0.1 or (ip,n) = 
(0.3, 10), the method of moments estimator displays better performance. 

An advantage of the method of moments estimator (4.7) is its simplicity in calculation. 
Because the estimator is expressed in closed form, it is not necessary to use any numeri- 
cal algorithm to estimate the parameter. In addition, the estimator displays satisfactory 
performance when \ip\ is small. However, the maximum likelihood estimator is consid- 
ered a better estimator in most combinations of ip and n, as seen in Table 1. A small 
drawback of the maximum likelihood estimator is the complexity involved in calculating 
the estimator. Since the maximum likelihood estimator is not expressed in a closed form, 
we need to resort to a numerical method to obtain the estimate for the parameter. 
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5. Related models 



5.1. Generalizations of the model with density (2.1) 

As described in Section 2.1, the model with density (2.1) is generated using Brownian 
motion starting at Bq = 0. In this subsection, we briefly discuss a distribution which 
is generated using Brownian motion starting at Bq — £(||£|| < p) instead of Bo = 0. We 
define a random vector (U, V) = (QB Tl /\\B Tl || , B T2 ) in the same way as was used in 
Section 2.1, except that we incorporate the new starting point. The resulting density for 
(U, V) is given by 

f( s_ _J 1 -P 2 P 2 - IIC1I 2 od-1 

I[U,V > Al„ 1 {l-2pu'Qv + pZy/*{pZ-2pU'Qi + U\\ 2 ) d/21 ' 
The marginals and conditional distribution of V given U = u are the exit distributions: 

[/-Exitd^QO, y~Exit rf (0 and V\(U = u) ~ Exit d (pQ'u). 

The conditional distribution of U given V = v is not of the usual form. This conditional 
distribution can be unimodal or bimodal and is generally skewed, except for certain 
special cases such as v — ±£/||£|| . It can be shown that U and V are independent if and 
only if p = 0. We note that the bivariate circular case of model (5.1) is a submodel of the 
distribution briefly discussed by Kato et al. (2008) as a model related to a circular-circular 
regression model. 

Another generalization arises from the use of the method discussed in Saw (1983), 
Section 4. This method enables us to derive a distribution with prescribed rotationally 
symmetric marginals. 

In the bivariate circular case, it might be promising to apply the Mobius transformation 
to each variable. Let (Zy, Zy) ~ BC+(tp) and define a random vector 

(Zu,Z v )=(^^,^f), K|,H<1. (5.2) 

Then, because of property (4.6), the marginals of Z\j and Zy have wrapped Cauchy 
distributions C*(a\) and C*(a2), respectively. Another benefit of this extension is that 
its density has a simple and exact form, including the normalizing constant which does 
not involve any special functions. 

It is also possible to transform the bivariate circular model into a distribution with 
specified marginals by applying Sklar's theorem in the theory of copulas. (See Nclscn 
(1998), Theorem 2.3.3.) For example, a bivariate distribution with von Mises or, equiva- 
lently, circular normal marginals is constructed as follows. Let (Zjj,Zy) have BC+(ip). 
Assume that (@u, ©v) = (Arg(Zfy), Arg(Zy)), where Arg(z) is the argument of z taking 
values on [0,27t). Suppose that Fj (j = 1,2) are distribution functions of the von Mises 
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distributions vM.(fj,j,Kj), namely. 



F j( e ) = / „ T 7 r exp{/t 3 -cos(t-/Xj-)}dt, (5.3) 

where < fij < 2n, Kj > and Jo(-) denotes the modified Besscl function of the first kind 
and of order 0. Define a distribution by a random vector 

«*.«*)-(*-(£),*T'(£ 

The density for this random vector is of the form 

1 - \ip\ 2 

f(6 u ,9 v ) = — — — — rcxp{/cicos(0„ -/ii) + n 2 cos(0 v -/i 2 )} 

x [1 + |^| 2 - 2\i>\ cos[27t{Fi(0 u ) - F 2 (6 V )} - arg(V-)]]" 1 , (5.4) 

o<e u ,e v <2n. 

It follows from Sklar's theorem that the marginals of Qjj and @v are the von Miscs, 
vM(/ii, Ki) and vM(^2, K 2 ), respectively. As is clear from the derivation, the distribution 
reduces to BC+(ip) when k\ = k 2 = 0. 

Figure 1 plots some contour plots of density (5.4) for fixed values of \i 2 ,k 2 and arg(^) 
and some selected values of and The comparison between Figure l(a)-(c) 

suggests that |?/>| influences dependence between and 0y, and this was mathemati- 
cally validated in Theorem 2. Figures 1(a) and 1(d) imply that the concentration of the 
marginal distributions is controlled by Ki or n 2 . As seen in Figures 1(e) and 1(f), the 
distribution can be skewed when yUi =/= fi 2 . 

It is important to decide on conditions for the independence of two variables for a 
bivariate distribution. The following result provides the parameter configuration which 
yields independence for the bivariate family including models (5.2) and (5.4). 

Theorem 5. Let (Zu,Zv) ~ BC+{ip). Suppose g\ and g 2 are one-to-one mappings de- 
fined on Q, and differ entiable on fi. Then gi(Zjj) and g 2 (Zy) are independent if and only 
if^ = 0. 

Proof. Let (Z U7 Z V ) = (gi(Zu) 7 g 2 (Z v )). The density of (Zu,Z v ) is then calculated as 



f{zu, Zv) 



1 1 - \tf,\ 



2 



47T 2 



ii-vs 2 " (^)sr (^)l 2 



99 1 (Zu) 



di 



dg 2 (z v ) 



dZy 



Zui Zy £ 



Given this, the necessary and sufficient condition for independence is that ipg2 1 (z v )g^ 1 (z u ) 
is a constant or a function which either depends only on z u or only on z v . Since neither 
gi nor g 2 is a constant function, this condition holds if and only if -0 = 0. □ 
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Figure 1. Contour plots of density (5.4) with [i-z = n, «2 = 1.16, arg(^) = and (a) /ii = 71, Ki = 
1.16, \t/j\ = 0; (b) hi = 71, Ki — 1.16, — 0.5; (c) fjii = 7t, ki = 1.16, = 0.8; (d) fii — n, 
Ki = 2.32, \ip\ = 0; (e) fn = 37t/2, «i = 1.16; \ip\ = 0.5; (f) mi = 3tt/2, ki = 1.16, \ip\ = 0.8. 
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5.2. Comparison with existing bivariate circular distributions 

Model (5.4) has some relation to models discussed by SenGupta (2004) and Shieh and 
Johnson (2005). A bivariate circular family related to the von Mises has been considered 
by SenGupta (2004). It has the density 



(1, cos 0„, sin 0„) m 2 i 

\TO31 

o < e a ,e v < 2tt, 




(5.5) 



where mjk (1 <j,k< 3) are the parameters, with mn, a function of the other m^'s, 
being the normalizing constant. This model has the property that both conditionals 
follow the von Mises distributions, a property which our model does not have. On the 
other hand, our model has von Mises marginals, while model (5.5) generally does not. 
This difference comes from the derivations of the models. Model (5.5) is constructed by 
means of the conditional specification, while our model is obtained by transforming a 
distribution, which is generated by Brownian motion, via copula theory, so that both 
marginals have the von Mises distributions. 

Shieh and Johnson (2005) presented a bivariate circular distribution which is called 
the bivariate von Mises distribution in their paper. The density of their model is of the 
form 

f(9u,9v) 



4ti 2 ii-=iM^) 

x exp[«i cos(0„ — fx-y ) + K 2 cos(9 v — fi 2 ) (5.6) 
+ ^cospTt^^) - F 2 (6 V )} - fi 3 }}, < 6 u ,e v < 2tt, 

where < fij < 2n, Kj >0,j= 1, 2, 3, and are the distribution functions of the von 

Mises vM(/jfe, Kfe), fc = 1,2, as defined in (5.3). A property common to their model and 
ours is that both models have the von Mises marginals and belong to a general class of 
distributions presented by Wehrly and Johnson (1980). One difference is the conditional 
distributions of the model and this distinction could make a difference in fit, as we see 
in the next subsection. 



5.3. Application 

Example 1. As an illustrative example in which one of our models is utilized, we 
consider a data set of pairs of wind directions measured at a weather station in 
Texas. The data set is a part of a larger data set which is taken from a website 
http: //data. eol.ucar.edu/codiac/dss/id=85. 034. The original data set contains 
hourly resolution surface meteorological data from the Texas Natural Resources Con- 
servation Commission Air Quality Monitoring Network. (These data are provided by 
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Table 2. The maximized log-likelihood, AIC and BIC values 
of the proposed model (5.4) and two existing models (5.5) and 
(5.6) fitted to wind directions at 6 a.m. and 7 a.m. 



Model 


logL 


AIC 


BIC 


(5.4) 


-65.9 


143.8 


152.2 


(5.5) 


-68.2 


152.4 


163.6 


(5.6) 


-70.9 


153.8 


162.2 



NCAR/EOL with the support of the National Science Foundation.) Among this data 
set, we focus on 30 pairs of wind directions at 6 a.m. (9 U ) and 7 a.m. {0 V ) measured each 
day at a weather station, which is denoted C28_l in the data set, from June 1 to June 
30, 2003. 

Figure 2(a) shows a planer of the dataset. This frame suggests that there is depen- 
dence between wind directions at 6 a.m. and 7 a.m. We fit models (5.4), (5.5) and (5.6) 
to the data set based on maximum likelihood estimation. To estimate the parameters 
numerically, we employ the PORT routine which is an optimization method carried out 
using nlminb in R 2.7.1. 

Table 2 shows the maximum log-likelihood, AIC and BIC values of the fitted models. 
According to the AIC and BIC criteria, our model (5.4) provides the best fit of all. Model 
(5.5) is the second best model judging from the AIC criterion, while the BIC criterion 
indicates that model (5.6) is the second best. The estimated parameters of model (5.4) 
are given by £1 = 1.89, fi 2 = 2.01, fci = 1.03, k 2 = 1.19, arg(V0 = 6.24 and = 0.75. The 
fitted density of model (5.4) is displayed in Figure 2(b) which seems to show a satisfactory 




Figure 2. (a) Planar plot of the wind directions at 6 a.m. and 7 a.m.; (b) contour plot of the 
density for model (5.4) fitted to the data. 



A distribution for a pair of unit vectors 



915 



Table 3. The maximized log-likelihood, AIC and BIC values 
of the proposed model (5.4) and two existing models (5.5) and 
(5.6) fitted to wind directions at 6 a.m. and 6 p.m. 



Model 


logL 


AIC 


BIC 


(5.4) 


-89.8 


191.6 


200.0 


(5.5) 


-82.0 


180.0 


191.2 


(5.6) 


-89.9 


191.8 


200.2 



fit of the model to the data set. Since the parameter which controls the dependence 
between two circular variables, is fairly large, it seems that the wind directions at 6 
a.m. and 7 a.m. arc strongly associated. Also, note that the argument of ip is close 
to zero, implying that the mean direction of the wind directions at 6 a.m. is close to 
that at 7 a.m. In other words, the mean direction of Fi(O^) — i^Oy) is nearly zero. 
These results correspond to our intuition that wind directions usually do not change 
dramatically within one hour. 

Example 2. The second example concerns another 30 pairs of wind directions measured 
at the same weather station as in Example 1. This time we focus on wind directions at 
6 a.m. (9 U ) and 6 p.m. (9 V ), observed from June 1 to June 30, 2003. Figure 3(a) shows 
a planar plot of the data set; it seems that the dependence structure between the two 
variables is not as clear as in the previous example. Again, we use models (5.4), (5.5) 
and (5.6) to model the data set. 




Figure 3. (a) Planar plot of the wind directions at 6 a.m. and 6 p.m.; (b) contour plot of the 
density for model (5.5) fitted to the data. 
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The maximum log-likelihood, AIC and BIC values of the three fitted models are 
given in Table 3. From AIC and BIC criteria, model (5.5) is the best of all. The fit- 
ted density of model (5.5) is displayed in Figure 3(b). The proposed model is the sec- 
ond best, but there is no significant difference in fit from Shieh and Johnson's model 
(5.6). The maximum likelihood estimates of the parameters of our model are given by 
£i = 2.29, £ 2 = 1-43, Ki = 1.33, k 2 = 0.222, axgty) = 0.144 and |^| = 0.544. Note that the 
estimate \ip\ in this example is smaller than that in the previous one. This suggests that 
there is less association between wind directions at 6 a.m. and 6 p.m. than between those 
at 6 a.m. and 7 a.m. However the estimate of arg(^) in this example is also close to 
zero, meaning that the mean direction of the wind directions did not make a big change 
although twelve hours have passed since the first observation at 6 a.m. 

In Example 1, which deals with a data set with clear dependence structure, the AIC 
and BIC criteria suggest that our model (5.4) is the best. The example implies that the 
proposed model (5.4) is suitable to fit bivariate circular data if Fi(®u) — ^(©v) nas a 
unimodal and symmetric shape. On the other hand, model (5.5) is recommended for a 
data set which shows a different kind of association between variables, as seen in Example 
2. One advantage of model (5.5) is that it is a flexible eight-parameter model having von 
Mises conditionals and seems to have more potential to fit various kinds of bivariate 
circular data because of its flexibility. Model (5.6) has some properties common to our 
model; for example, the marginals of both distributions are the von Mises. However, 
these models have different conditionals and this difference can produce considerable 
distinction in fit, as demonstrated in Example 1. 



6. Related distributions on R 2 and on the cylinder 

In previous sections, we have dealt with distributions for two directional observations. In 
this subsection, we provide models for two other manifolds, namely, R 2 and the cylinder. 

By applying bilinear fractional transformations to model (4.2), a distribution on R 2 is 
constructed. Let (Zjj,Zv) be distributed as BC-(ip)- Define a random vector (X,Y) as 

X = \ and Y = 1 . 

1 + Z v 1 + Z v 

Clearly, (X, Y) takes values in R 2 . It is straightforward to show that the joint density 
for (X, Y) is 

lm(0) 

7t 2 \x + y + 0(1 - xy)\< 

where 6 = i(l - ip)/(l + V0- Since \ip\ < 1, it is evident that lm(0) > 0. 
This model has the following properties: 



f^V) = - 2 T——^—-TT 2 , (6.1) 



X~C(i), Y~C(i), 



1-6*2/7 \l-9x 
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where the C(<f>) notation is derived from McCullagh (1992) and denotes the Cauchy 
distribution on the real line with location parameter Re(0) and scale parameter Im(</>). 
Thus, the marginals and conditionals are members of the real Cauchy family. Further 
properties of model (6.1) are derived using the inverse transformations Z\j = (1+LY)/(1 — 
iX) and Zy = (l+iF)/(l— iY) which map the real line onto the unit circle in the complex 
plane. 

A related distribution on the cylinder Q x R is obtained in a similar fashion. Let 
(Zjj,Zv) be BC +(V>) distributions. Define a random vector 

(Ze,X)=(^,iI^). (6.2) 

The marginals and conditionals of (Zq,X) are then 

Z @ ~C*(0), X~C{i), 

ZeK x _.) ~c- (i±|v), x V z e . w) . c (_,JzW). 

Thus, the marginals are circular uniform and standard Cauchy, while the conditionals 
are the wrapped Cauchy and linear Cauchy distributions, respectively. 

In a manner similar to that in Section 5.1, one can transform the model BS+(ip) 
into a family of cylindrical distributions having prescribed marginals as follows. Let 
(Zu, Zy) ~ BS+{iIj) and express these variables in terms of radians, that is, {®Ui ©v) = 
{kxg{Zjj) 1 Arg(Zy)). Suppose that Fq and Fx are distribution functions of any circular 
and linear distributions, respectively, and are strictly increasing. A random vector defined 
by (9, X) = (i ;l 1 (e [ //(27t)),i^ 1 (ey/(27r))) then follows a distribution on the cylinder 
which has marginals with distribution functions -Fe and Fx ■ For instance, if we assume 
that Fq and Fx are distribution functions of the von Mises and the normal distribution, 
respectively, we can construct a distribution with von Mises and normal marginals. 

A straightforward modification of Theorem 5 yields parameter configuration for inde- 
pendence between two variables for the distributions presented in this subsection. For 
example, if a random vector (Zq, X) is defined as in (6.2), then Zq and V are independent 
if and only if tp = 0. 

Appendix A: Derivation of density (2.1) 

Let c(u,v) be the joint density of (U,V) = (QB Tl /\\B Tl ||, B T2 ), which is defined in the 
same way as in Section 2.1. Note that if the density for (U, V) exists, it can be expressed 
as 

c(u,v) = fu(u)g v \u(v\u), it,weS' d ~ 1 , 

where fu is a density for the marginal of U and gv\u that for the conditional of V given 
U = u. Clearly, the marginal of U is distributed as the uniform distribution and thus 
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fu( u ) = Because of the Markov property of Brownian motion, the conditional of 

V given U = u is essentially equivalent to the exit distribution for the sphere generated by 
Brownian motion starting at B = pQ'u (see Durrett (1984), Section 1.10). The density 
for the exit distribution for the sphere is known to be 

9v ^ vlu)= ^\\v-~ P Q< u \r veSd "- 

We thus obtain the density (2.1). 

Density (5.1) is obtained by a straightforward modification of the above. 



Appendix B: Proofs of Theorems 2—4 



Proof of Theorem 2. Since the marginals of U and V are uniformly distributed on 
the sphere, it is evident that E(U) = E(V) = and E(UU') = E(W) = d^I. 

We show that E(UV) = d^ 1 pi. Because model (2.1) is 0(d)-symmetric in the sense of 
Rivest (1988), calculation of E(UV) is simplified by applying Proposition 1 of his paper 
to 

E(UV') = dmg{E(R j S j )}Q, 
where (R, S) ~ BS d (pI),R = (R U ..., R d )' , S=(S U ..., S d )' . Consider the integral 

E(R 1 S 1 )= [ r 1 s 1 c{r,s)drds= [ -p— [ -p— tt — ^Tui dsdr - 

Transforming S into S = PS, where P is a d x d orthogonal matrix such that P = 
(r,p 2 , . . . ,Pd)',Pj = (Pji, ■ ■ ■ ,Pjd)' e we have 

n f si i-p 2 

■ dsdr 



i-i Ad-i Js*-* A d-x pr\\~ 

d _! A d -i J 8 *-i Aa-! (\-2p~ Sl +P 2 ) d ' 2 

_£i l —P^ d s 

d -i dA d . x {l-2ps 1 +p 2 ) d / 2 ■ 

The last equality follows from E{R) = and E{R\) = d~ l . 

Then, from the fact that X ~ H'(9, v) implies E(X) = 9 (McCullagh (1989)), the above 
equation can be expressed as 



d -± dAd-i (1 - 2psx + p 2 ) d / 2 
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1-p 2 27t< d - 1 )/ 2 f n cos0sin d - 2 

do 



dA d -i T{(d - l)/2} J (1 - 2pcos0 + p 2 ) d l 2 

1-p 2 f 1 t(l -t 2 )( d - 3 )/ 2 



dB{(d - 1)/2, 1/2} y_ x (i - 2 P t + P 2 yi 



■At 



= p 
d' 

The other elements, E(RjSj) (2<j< d), are calculated in a similar way. □ 

Proof of Theorem 3. The distribution function of T, say H , is given by 

H(t) = P{T <t)= E v {P(U'Qv < t\V = v)} 
= Ey{P(U'v<t\V = v)}, 

where V = QV . Define U = PU, where P g 0(d) such that P = (v,p 2 , ■ ■ -,Pd)',Pj € R d , 
and one obtains 

Ey{P{U'v<t\V = v)}= [ [ g( Ul )dudv 

27T (rf-l)/2 



^-i rJf , nm / . 9 (cos0)sin d - 2 0d0 

1 ^0 — Ij/Zj- J cos 6l<t 

o<e<7t 



Thus, 

= A d -i ■ Ad- 2 g{t){l - t 2 )^' 2 , -1<*<1. 



□ 



Proof of Theorem 4. It is clear from (3.2) that pQ is an unbiased estimator of pQ. 
Consistency of the estimator follows from the weak law of large numbers. The use of 
the central limit theorem enables us to prove the asymptotic normality. Here, we show 
that the variance-covariance matrix of g(pQ) is given by a matrix with entries (3.4). 
Suppose that (U, V) ~ BS d (pQ),U = (U U ..., U d )' and V = (V u . . ., V d )' . To calculate 
the variance-covariance matrix, we first consider d 2 E(UiVjUkVi) for 1 < i,j,k,l < d. It 
can be expressed as 

d 2 E(UiVjU k Vi) = d 2 E u {U l U k E v \ u {V ] Vi\U = «)} 

= d 2 Eu[U t U k E vw {b' 3 diag(V?)6,|tf = u}], 

where V = (Vi, . . . , V d )' = PQV, bj is the jth column of PQ and P is any d x d or- 
thogonal matrix such that the first row of P is u' . McCullagh (1989) showed that if 
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X~H'(9,v),thcnE(X 2 ) = {l + (2v+l)9 2 }/{2(v+l)},&nd that f\(l -x 2 )"- 1 ' 2 /{{l- 
29x + 9 2 )B(v + i, ^)}da; = 1. Using these results, we have 

1 — 2 

i^{diag(T/ 2 )|£/ = u} = —^-1 + p 2 A u 

where A k = (Sij) is a d x d matrix whose entries are given by Sij = 1 for = (fe, k) 
and Sij = otherwise. Therefore, 

d 2 E(U t V J U k V l ) = d 2 E u \u l u k q' J + p 2 UU^j a j, 

where qj is the jth column of Q. If i = k and j = 1, we have 

dn 2 

d 2 E(U l V J U k V l ) = (1 - p 2 ) + + 24). 

The above follows from the fact that E V (UU') = d" 1 / and E u {U l UlJ r ) = (2A t + 
I)/{d(d+ 2)}. If i ^ k or j ^ I, it is fairly easy to show that 



do 2 

d 2 E{U l V 1 U k V l ) = d 2 p 2 q / J E u (U l U k UU')q l = -J^( qij q kl + q kj q u ). 



Thus, we obtain d 2 E{UiVjU k V{) for any 1 < i,j,k,l < d. On the other hand, it follows 
immediately from (3.2) that dE(UiVj) = pqij. Summarizing these results, we obtain The- 
orem 4. □ 
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